New Instability Results for High Dimensional Nearest Neighbor Search

Author

  • Chris Giannella
Abstract

Consider a dataset of n(d) points generated independently from R^d according to a common p.d.f. f_d with support(f_d) = [0, 1]^d and sup{f_d([0, 1]^d)} growing sub-exponentially in d. We prove that: (i) if n(d) grows sub-exponentially in d, then, for any query point q ∈ [0, 1]^d and any ε > 0, the ratio of the distances from any two dataset points to q is less than 1 + ε with probability → 1 as d → ∞; (ii) if n(d) > [4(1 + ε)] for large d, then for all q ∈ [0, 1]^d (except a small subset) and any ε > 0, the distance ratio is less than 1 + ε with limiting probability strictly bounded away from one. Moreover, we provide preliminary results along the lines of (i) when f_d = N(μ_d, Σ_d).
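The concentration effect described in (i) can be observed empirically. The following is a minimal sketch, not the paper's proof or method: it assumes the uniform density on [0, 1]^d (one simple density that meets the stated conditions, since its supremum is 1 for every d), a fixed dataset size, and Euclidean distance. The function name `distance_ratio` and all parameter values are hypothetical choices made for illustration. It reports the ratio of the farthest to the nearest distance from a random query point, which bounds the distance ratio for every pair of dataset points.

```python
import numpy as np


def distance_ratio(n, d, rng):
    """Return max/min Euclidean distance from a random query to n points.

    Points and query are drawn uniformly from [0, 1]^d (an assumption made
    for this illustration; the paper covers a broader class of densities).
    """
    points = rng.random((n, d))   # n dataset points in [0, 1]^d
    query = rng.random(d)         # one query point in [0, 1]^d
    dists = np.linalg.norm(points - query, axis=1)
    return dists.max() / dists.min()


rng = np.random.default_rng(0)

# Fixed n, growing d: the worst-case distance ratio should shrink toward 1.
ratios = {d: distance_ratio(200, d, rng) for d in (2, 20, 200, 2000)}
for d, r in ratios.items():
    print(f"d={d:5d}  max/min distance ratio = {r:.3f}")
```

With the dataset size held fixed, the printed ratio should move toward 1 as d increases, which is the instability the abstract refers to: the nearest and farthest neighbors become nearly indistinguishable by distance.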


Similar resources

Fast Nearest Neighbor Search in High-Dimensional Space

Similarity search in multimedia databases requires an efficient support of nearest-neighbor search on a large set of high-dimensional points as a basic operation for query processing. As recent theoretical results show, state of the art approaches to nearest-neighbor search are not efficient in higher dimensions. In our new approach, we therefore precompute the result of any nearest-neighbor se...


What Is the Nearest Neighbor in High Dimensional Spaces?

Nearest neighbor search in high dimensional spaces is an interesting and important problem which is relevant for a wide variety of novel database applications. As recent results show, however, the problem is a very difficult one, not only with regards to the performance issue but also to the quality issue. In this paper, we discuss the quality issue and identify a new generalized notion of neares...


Fast Nearest-Neighbor Search Algorithms Based on High-Multidimensional Data

Similarity search in multimedia databases requires an efficient support of nearest-neighbor search on a large set of high-dimensional points as a basic operation for query processing. As recent theoretical results show, state of the art approaches to nearest-neighbor search are not efficient in higher dimensions. In our new approach, we therefore pre-compute the result of any nearest-neighbor s...


Indexing the Solution Space: A New Technique for Nearest Neighbor Search in High-Dimensional Space

Similarity search in multimedia databases requires an efficient support of nearest-neighbor search on a large set of high-dimensional points as a basic operation for query processing. As recent theoretical results show, state of the art approaches to nearest-neighbor search are not efficient in higher dimensions. In our new approach, we therefore precompute the result of any nearest-neighbor se...


On Optimizing Nearest Neighbor Queries in High-Dimensional Spaces

Nearest-neighbor queries in high-dimensional space are of high importance in various applications, especially in content-based indexing of multimedia data. For an optimization of the query processing, accurate models for estimating the query processing costs are needed. In this paper, we propose a new cost model for nearest neighbor queries in high-dimensional space, which we apply to enhance t...


On Optimizing Nearest Neighbor Queries in High-Dimensional Data Spaces

Nearest-neighbor queries in high-dimensional space are of high importance in various applications, especially in content-based indexing of multimedia data. For an optimization of the query processing, accurate models for estimating the query processing costs are needed. In this paper, we propose a new cost model for nearest neighbor queries in high-dimensional space, which we apply to enhance t...



Journal:
  • Inf. Process. Lett.

Volume: 109  Issue:

Pages: -

Publication date: 2009